Abstract

We investigated the role of common genetic variation in immune-related genes in breast cancer disease-free survival (DFS)
in Korean women. A total of 107 breast cancer patients from the Seoul Breast Cancer Study (SEBCS) were selected for this
study. A total of 2,432 tag single nucleotide polymorphisms (SNPs) in 283 immune-related genes were genotyped with the
GoldenGate oligonucleotide pool assay (OPA). A multivariate Cox proportional hazards model and a polygenic risk score
model were used to estimate the effects of SNPs on breast cancer prognosis. Harrell's C index was calculated to estimate the
predictive accuracy of the polygenic risk score model. Subsequently, an extended gene set enrichment analysis (GSEA-SNP)
was conducted to identify the biological pathways involved. In addition, to compare our results with current evidence,
previous studies were systematically reviewed. Sixty-two SNPs were statistically significant at a p-value less than 0.05. The
most significant SNPs were rs1952438 in the SOCS4 gene (hazard ratio (HR) = 11.99, 95% CI = 3.62-39.72, P = 4.84E-05),
rs2289278 in the TSLP gene (HR = 4.25, 95% CI = 2.10-8.62, P = 5.99E-05) and rs2074724 in the HGF gene (HR = 4.63,
95% CI = 2.18-9.87, P = 7.04E-05). In the polygenic risk score model, the HR of women in the 3rd tertile was 6.78
(95% CI = 1.48-31.06) compared to patients in the 1st tertile of the polygenic risk score. Harrell's C index was 0.813 with
all patients and 0.924 in 4-fold cross-validation. In the pathway analysis, 18 pathways were significantly associated with
breast cancer prognosis (P<0.1). IL-6R, IL-8, IL-10RB, IL-12A, and IL-12B were associated with the prognosis of cancer in
both our study and a previous study. Therefore, our results suggest that genetic polymorphisms in immune-related genes
are relevant to breast cancer prognosis among Korean women.
Introduction

Cancer is a significant health problem in many parts of the
world [1,2]. In Korea, breast cancer ranked second in incidence
and fifth in mortality among cancers in women, with rates that
steadily increased from 1983 to 2010 [3]. The etiology and
progression of breast cancer is a multistep process driven by a
combination of environmental, hormonal, and genetic
factors [4,5]. We focused on genetic factors involved in the
immune response, which is known to play a role in breast cancer
prognosis.

The association of immune markers with breast cancer
prognosis is well known, as is their role as key factors in the
tumor microenvironment, where they can suppress or promote
tumor growth. For example, a high density of CD68, which reflects
high infiltration of tumor-associated macrophages, was related to
poorer outcome in node-negative breast cancer [6], and
CD44-positive patients showed longer overall survival and
progression-free survival than CD44-negative patients [7]. In
addition, cytokines produced by various immune cells are known
to modulate the transition from the innate to the adaptive
immune response, the activation of anti-tumor cells, persistent
oxidative stress, and the angiogenesis of breast cancer [8-10].
The prognosis of breast cancer is also known to be associated
with single nucleotide polymorphisms (SNPs) in immune
system-related genes [11-14]. Those reports described that
genetic variants of toll-like receptor 4 (TLR4), interleukin 12
(IL-12), interleukin 2 (IL-2), and interleukin 6 (IL-6) were
related to breast cancer prognosis. However, few studies have
investigated the association between a comprehensive list of
variants in immunity-related genes and the prognosis of breast
cancer.

Given the findings that the immune system is related to breast
cancer prognosis, we hypothesized that many genetic
polymorphisms in immune-related genes might be prognostic
factors for breast cancer recurrence. In this study, the role of
common immune genetic variation in the disease-free survival
(DFS) of breast cancer was investigated with a multivariate Cox
proportional hazards model for individual variants, a polygenic
risk score model, and an extended gene set enrichment analysis.
Additionally, we systematically reviewed previous literature
reporting associations between variants of immunity-related
genes and the prognosis of various cancers.

Materials and Methods 

Study population 

Among subjects of the Seoul Breast Cancer Study (SEBCS), a
multicenter case-control study that recruited between 2001 and
2007, the participants in this study were patients diagnosed with
histologically confirmed breast cancer at Seoul National
University Hospital during 2002-2004. Based on sample
availability and DNA quality, 140 breast cancer patients were
successfully genotyped [15]. Of these, 107 patients were
included in the final analysis after excluding patients who lacked
survival status or clinical information or who had been diagnosed
with metastatic breast cancer.

During recruitment, well-trained interviewers provided patients
with informed consent forms and collected information with a
structured questionnaire. Information on survival status,
hormone receptor status, and TNM stage [16] was obtained by
abstracting the medical charts.

This study design was approved by the Committee on Human 
Research of Seoul National University Hospital (IRB No. H-0503- 
144-004). 

Genotyping 

Among the 209 samples that met the genotyping criteria
(concentration >7.5 ng/µl and total amount of DNA >750 ng),
140 cases were successfully genotyped. The 283 immune-related
candidate genes comprised 190 innate immune-related genes on
the innate immune oligonucleotide pool assay (OPA) chip and 93
adaptive immune-related genes on the Non-Hodgkin's lymphoma
(NHL) OPA chip, as described in previous studies [15,17]. A total
of 2,432 tag SNPs were selected with the SNP500 Cancer project
database, covering the region from 20 kb upstream of the
transcription start site of a candidate gene to 10 kb downstream
of the end of the last exon of the
candidate gene and genotyped. Among them, 461 SNPs were 
excluded from the analysis because of low minor allele frequency 
(MAF) (<3%) and deviation from Hardy-Weinberg Equilibrium 
(HWE) (P<10" + ). Finally, a total of 1,971 SNPs in 279 immunity 
genes were selected for the analysis. 
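
The quality-control step described above can be sketched in code. The following is a minimal illustration, not the authors' pipeline (the study lists PLINK among its tools): it computes the minor allele frequency and a one-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium from genotype counts. The function names, the chi-square form of the HWE test, and the HWE cutoff used in the usage lines are assumptions made for illustration; only the 3% MAF cutoff is taken from the text.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with one degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa, n_ab, n_bb, maf_min, hwe_p_min):
    """Keep a SNP only if it is common enough and not far from HWE."""
    maf_ok = minor_allele_frequency(n_aa, n_ab, n_bb) >= maf_min
    hwe_ok = hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_p_min
    return maf_ok and hwe_ok


# Usage with hypothetical genotype counts; hwe_p_min here is a placeholder,
# not the threshold used in the study.
print(passes_qc(70, 28, 2, maf_min=0.03, hwe_p_min=1e-4))
print(passes_qc(98, 2, 0, maf_min=0.03, hwe_p_min=1e-4))
```

An exact HWE test is often preferred for small samples such as this one; the chi-square version is used here only because it fits in a few lines.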

Statistical method 

DFS was calculated from the date of breast cancer surgery to
the date of last follow-up or of an event, defined as
loco-regional, distant, or contralateral recurrence or death from
any cause. Patients with no evidence of recurrence were censored
at the last follow-up date or on June 30, 2011. The median
follow-up time was 4.87 years (range, 0.25-6.72 years).
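
As a rough sketch of how the DFS time and event indicator defined above might be derived from dates, the snippet below returns the follow-up time in years and a 0/1 event flag. It assumes that censoring happens at whichever comes first, the last follow-up or June 30, 2011, and that events recorded after that date are not counted; the text does not spell out either rule, and the function name and the example dates are hypothetical.

```python
from datetime import date

# Administrative censoring date stated in the text.
ADMIN_CENSOR_DATE = date(2011, 6, 30)


def dfs_observation(surgery, last_follow_up, event_date=None):
    """Return (time_in_years, event) for one patient.

    event_date is the date of recurrence or death, or None if no event
    was observed.
    """
    if event_date is not None and event_date <= ADMIN_CENSOR_DATE:
        end, event = event_date, 1
    else:
        # Assumption: censor at the earlier of the two candidate dates.
        end, event = min(last_follow_up, ADMIN_CENSOR_DATE), 0
    return (end - surgery).days / 365.25, event


# Hypothetical patients: one censored administratively, one with an event.
print(dfs_observation(date(2003, 1, 1), date(2012, 1, 1)))
print(dfs_observation(date(2003, 1, 1), date(2010, 1, 1), date(2005, 1, 1)))
```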

Demographic data including age (<50 and ≥50), body mass
index (BMI) (<21.4 and ≥21.4), family history of breast cancer in
1st- and 2nd-degree relatives (no and yes), educational level
(≤ middle school, high school, and ≥ college or university),
smoking status (never and ever), alcohol consumption (never and
ever), and menopausal status (premenopausal and
postmenopausal), and clinicopathological data including
estrogen receptor (ER) status (positive and negative),
progesterone receptor (PR) status (positive and negative), and
7th AJCC TNM stage (I, II, and III) were assessed for DFS with
the log-rank test and a univariate Cox proportional hazards
model. A multivariate Cox proportional hazards model adjusted
for age, ER status, PR status, and TNM stage (I, II, and III) was
used to calculate the hazard ratio (HR) and 95% CI of the effect
of each SNP on the DFS of breast cancer
Figure 1. Associations of the polygenic risk score on breast cancer disease free survival. Kaplan-Meier survival curve and estimated
hazard ratios (HRs) of breast cancer in groups defined by tertile derived from the polygenic risk scores of the 107 patients with all 62 SNPs.
doi:10.1371/journal.pone.0103593.g001 
Figure 2. Overview of inclusion and exclusion criteria in systematic review.

doi:10.1371/journal.pone.0103593.g002 



based on additive genetic models. If SNPs were located in the
same candidate gene and were in linkage disequilibrium (LD)
(r² > 0.4), the most significantly associated SNP was selected.
To correct for multiple comparisons, false discovery rate (FDR)
p-values were calculated with the Benjamini-Hochberg method [18].
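
The Benjamini-Hochberg adjustment named above can be written in a few lines. The sketch below is a generic implementation, not the authors' code; the function names are illustrative. The helper `bh_threshold` gives the step-up cutoff for a given rank, and the usage lines apply it to the third-smallest p-value reported in this study under the assumption that all 1,971 analyzed SNPs count as tests.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def bh_threshold(rank, m, q=0.05):
    """Largest raw p-value that the rank-th smallest test may have at FDR q."""
    return rank / m * q


# Small synthetic example.
print(bh_adjust([0.01, 0.04, 0.03, 0.005]))

# Step-up check for the third-ranked SNP (P = 7.04E-05), assuming m = 1971.
print(7.04e-5 <= bh_threshold(3, 1971))
```

Because the procedure is step-up, the first- and second-ranked SNPs are declared significant whenever the third-ranked one meets its cutoff, even if they would miss their own rank-specific cutoffs.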

For the polygenic risk score method, the score was calculated
by adding the number of risk alleles in each patient based on the
individual SNP analyses, and the patients were categorized into
tertiles of the polygenic risk score [19]. The HR and 95%
confidence interval (CI) for each tertile of the polygenic risk
score were calculated. After fitting the multivariate Cox
proportional hazards model, Harrell's C index was calculated to
evaluate the predictive accuracy of the polygenic risk score
model [20]. In addition, a 4-fold cross-validation method was
used to appraise the internal validity of our model: the entire
data set was randomly partitioned into 4 subsets of equal size.
Of the 4 subsets, 3 were used as training data and the remaining
subset was retained as validation data for testing the model.
SNPs significantly associated with breast cancer prognosis were
first identified in the training set, and Harrell's C index was
then estimated in the validation set based on those SNPs. The
cross-validation process was repeated 4 times. The 4 Harrell's C
indices were summarized by fixed-effect meta-analysis.
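
The scoring and evaluation steps in this paragraph can be illustrated with a short sketch. It is not the authors' implementation: the rule that the risk allele is the minor allele when HR > 1 and the major allele otherwise, the rank-based tertile cut points, and all function names are assumptions made for illustration. Harrell's C is computed here directly from its pairwise definition, without adjustment for covariates.

```python
def risk_allele_count(minor_allele_dosage, hazard_ratio):
    """Risk alleles carried at one SNP (assumed rule, see lead-in)."""
    if hazard_ratio > 1:
        return minor_allele_dosage
    return 2 - minor_allele_dosage


def polygenic_risk_score(risk_allele_counts):
    """Unweighted score: total number of risk alleles for one patient."""
    return sum(risk_allele_counts)


def tertile_labels(scores):
    """Assign each score to tertile 1, 2 or 3 using rank-based cut points."""
    ranked = sorted(scores)
    n = len(ranked)
    cut1, cut2 = ranked[n // 3], ranked[(2 * n) // 3]
    return [1 if s < cut1 else 2 if s < cut2 else 3 for s in scores]


def harrell_c(times, events, scores):
    """Harrell's C: share of usable pairs ranked correctly by the score."""
    concordant = tied = usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: patient i had an observed event before time j.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1
                elif scores[i] == scores[j]:
                    tied += 1
    return (concordant + 0.5 * tied) / usable


# Hypothetical data: higher scores fail earlier, so concordance is perfect.
print(tertile_labels([1, 2, 3, 4, 5, 6]))
print(harrell_c([1, 2, 3, 4], [1, 1, 0, 1], [9, 7, 5, 3]))
```

In a cross-validated setting, the SNP list and risk-allele directions would be derived from the training folds only and the C index computed on the held-out fold, as the paragraph describes.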

The GSEA-SNP method was used to reveal the biological
function of the SNPs that were significantly related to breast
cancer prognosis [21]. Pathway information was obtained from
the Molecular Signatures Database (MSigDB), which collects
annotated gene sets from the following online databases:
BioCarta, KEGG, Pathway Interaction Database, Reactome,
SigmaAldrich, Signaling Gateway, Signal Transduction KE, and
SuperArray. In addition, gene sets extracted from experimental
studies were included in the database. The curated gene sets were
downloaded from MSigDB (version 4.0, C2). Because a biological
pathway could be too narrowly defined, each pathway was
required to contain at least three genes in the following
analyses. The names of gene sets are given by their 'brief
description' rather than the 'standard name' that is available on the
Table 1. Characteristics of study participants.

| Characteristics | No. of patients (%) | No. of events (%) | P a | HR b | (95% CI) | P b |
|---|---|---|---|---|---|---|
| Total | 107 (100.0) | 20 (100.0) | | | | |
| Age (Mean ± SD) | 50.6±8.2 | 52.5±10.6 | 0.60 | | | |
| <50 | 54 (50.5) | 11 (55.0) | | 1.00 | | |
| ≥50 | 53 (49.5) | 9 (45.0) | | 0.79 | (0.33-1.91) | 0.60 |
| Body mass index (Mean ± SD) | 23.7±2.9 | 24.4±2.13 | <0.02 | | | |
| <21.4 (median) | 30 (33.3) | 1 (5.0) | | 1.00 | | |
| ≥21.4 | 77 (66.7) | 19 (95.0) | | 7.30 | (0.98-54.61) | 0.05 |
| Family history | | | 1.00 | | | |
| No | 97 (90.7) | 18 (90.0) | | 1.00 | | |
| Yes | 10 (9.3) | 2 (10.0) | | 1.00 | (0.23-4.37) | 1.00 |
| Educational level | | | 0.46 | | | |
| ≤Middle school | 30 (28.3) | 4 (20.0) | | 1.00 | | |
| High school | 46 (43.4) | 11 (55.0) | | 1.95 | (0.62-6.13) | 0.26 |
| ≥College or university | 30 (28.3) | 5 (25.0) | | 1.27 | (0.34-4.73) | 0.72 |
| Menopausal status | | | 0.71 | | | |
| Premenopausal | 62 (58.5) | 11 (55.0) | | 1.00 | | |
| Postmenopausal | 44 (41.5) | 9 (45.0) | | 1.18 | (0.49-2.84) | 0.72 |
| Smoking status | | | 0.10 | | | |
| Never | 100 (93.5) | 17 (85.0) | | | | |
| Ever | 7 (6.5) | 3 (15.0) | | 2.70 | (0.78-9.17) | 0.12 |
| Alcohol consumption | | | 0.66 | | | |
| Never | 70 (65.4) | 14 (70.0) | | | | |
| Ever | 37 (34.6) | 6 (30.0) | | 0.81 | (0.31-2.10) | 0.66 |
| Estrogen receptor status | | | 0.07 | | | |
| Positive | 66 (62.3) | 9 (45.0) | | 1.00 | | |
| Negative | 40 (37.7) | 11 (55.0) | | 2.19 | (0.90-5.28) | 0.08 |
| Progesterone receptor status | | | 0.01 | | | |
| Positive | 53 (50.5) | 5 (25.0) | | 1.00 | | |
| Negative | 52 (49.5) | 15 (75.0) | | 3.39 | (1.23-9.37) | 0.02 |
| TNM stage | | | <0.01 | | | |
| 0/I | 48 (45.3) | 4 (20.0) | | 1.00 | | |
| II | 40 (37.7) | 7 (35.0) | | 2.20 | (0.64-7.56) | 0.21 |
| III | 18 (17.0) | 9 (50.0) | | 8.54 | (2.62-27.88) | <0.01 |

a Log rank test.
b Univariate Cox-proportional hazard model.
doi:10.1371/journal.pone.0103593.t001



GSEA website (http://www.broadinstitute.org/gsea/index.jsp),
because the standard names describe the function of a gene set
ambiguously.
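
To make the gene set step more concrete, the sketch below shows two pieces in simplified form: keeping only gene sets that contain at least three genotyped genes, and computing a weighted running-sum enrichment score over a ranked list in the general style of GSEA. This is an illustration under assumptions, not the GSEA-SNP implementation used in the study: the ranking unit (genes rather than SNPs), the weight parameter, the omission of permutation-based p-values, and the toy gene sets and statistics in the usage lines are all hypothetical.

```python
def filter_gene_sets(gene_sets, study_genes, min_genes=3):
    """Keep gene sets that contain at least min_genes of the study genes."""
    study = set(study_genes)
    kept = {}
    for name, genes in gene_sets.items():
        overlap = sorted(set(genes) & study)
        if len(overlap) >= min_genes:
            kept[name] = overlap
    return kept


def enrichment_score(ranked_genes, statistics, gene_set, weight=1.0):
    """Maximum deviation from zero of a weighted running sum.

    ranked_genes must be sorted from most to least strongly associated;
    statistics holds the matching association statistics.
    """
    members = set(gene_set)
    hit_total = sum(abs(s) ** weight
                    for g, s in zip(ranked_genes, statistics) if g in members)
    n_miss = sum(1 for g in ranked_genes if g not in members)
    if hit_total == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")
    running = best = 0.0
    for gene, stat in zip(ranked_genes, statistics):
        if gene in members:
            running += abs(stat) ** weight / hit_total  # step up on a hit
        else:
            running -= 1.0 / n_miss                      # step down on a miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy example with made-up gene sets and statistics.
sets = {"set_a": ["BCL2", "NBN", "HGF"], "set_b": ["BCL2", "XYZ"]}
print(filter_gene_sets(sets, ["BCL2", "NBN", "HGF", "SOCS4"]))
print(enrichment_score(["A", "B", "C", "D"], [4, 3, 2, 1], {"A", "B"}))
```

In a full analysis the observed score would be compared with scores from permuted data to obtain a p-value for each gene set.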

Statistical significance was defined as a p-value less than
0.05 in both the multivariate Cox proportional hazards model for
individual variants and the polygenic risk score model, and less
than 0.1 in GSEA-SNP. SAS statistical software version 9.3,
PLINK version 1.07, R 2.15.1 (GenABEL package), and STATA
statistical software version 12.0 were used for the analyses.

Systematic review 

Previous epidemiologic studies that analyzed associations
between immunity-related genetic factors and the prognosis of
cancer, published from January 2000 through December 2013,
were selected (Figure 2). Studies available for the systematic
review were searched in the PubMed and EMBASE databases with
a set of keywords that delineated breast cancer as well as other
cancers, immunity, genetic factors, and survival: cancer AND
immune AND polymorphism AND survival. Abstracts were
reviewed to identify reports examining associations between
immunity-related genetic factors and clinical outcomes including
recurrence and death. Publications were excluded if they were
review papers, were unrelated to genomic epidemiology, used
SNPs located in non-immune-related genes, were duplicated in
both databases, reported no survival or recurrence data for
survival analysis, or reported no hazard ratios (HRs) estimated
with the Cox proportional hazards model for the associations of
immunity-related genetic factors with cancer outcomes
(Figure 2). Studies duplicated between the two databases were
counted as having been retrieved from the PubMed database. The
following data were extracted from each eligible study: disease
site, authors, genes
Table 2. Associations between the genetic variations of immunity-related genes and breast cancer disease free survival in the
additive model (significance level, P<5.00E-02).

| Gene | Location | SNP | MAF | HR a | (95% CI) | P |
|---|---|---|---|---|---|---|
| SOCS4 | intronic | rs1952438 | 0.04 | 11.99 | (3.62-39.72) | 4.84E-05 |
| TSLP | UTR5 | rs2289278 | 0.15 | 4.25 | (2.10-8.62) | 5.99E-05 |
| HGF | intronic | rs2074724 | 0.11 | 4.63 | (2.18-9.87) | 7.04E-05 |
| IL-17C | intronic | rs2254073 | 0.15 | 4.24 | (1.90-9.49) | 4.31E-04 |
| BCL2 | intergenic | rs9989529 | 0.19 | 3.80 | (1.63-8.84) | 1.98E-03 |
| ecu | intergenic | rs17652343 | 0.08 | 4.57 | (1.74-11.97) | 2.01E-03 |
| ITGB2 | intronic | rs2838727 | 0.04 | 6.57 | (1.84-23.44) | 3.70E-03 |
| TRAF2 | intergenic | rs908831 | 0.14 | 3.79 | (1.54-9.36) | 3.79E-03 |
| NBN | downstream | rs2142097 | 0.42 | 3.55 | (1.48-8.49) | 4.40E-03 |
| SELE | intergenic | rs4656701 | 0.35 | 0.28 | (0.11-0.71) | 7.41E-03 |
| CCR1 | downstream | rs3136671 | 0.19 | 3.05 | (1.33-7.00) | 8.47E-03 |
| HGF | intronic | rs5745752 | 0.33 | 0.29 | (0.11-0.73) | 9.22E-03 |
| IL-12A | intergenic | rs9811792 | 0.31 | 0.23 | (0.08-0.71) | 1.01E-02 |
| MIF | ncRNA_exonic | rs1007888 | 0.41 | 2.39 | (1.22-4.67) | 1.11E-02 |
| ITGB2-AS1 | ncRNA_exonic | rs2070946 | 0.12 | 2.98 | (1.28-6.93) | 1.11E-02 |
| MIF | ncRNA_intronic | rs2000466 | 0.18 | 3.37 | (1.32-8.60) | 1.12E-02 |
| ALOXE3 | intronic | rs3027215 | 0.07 | 3.17 | (1.28-7.87) | 1.27E-02 |
| IFNAR2 | intronic | rs2073362 | 0.15 | 3.86 | (1.33-11.17) | 1.28E-02 |
| XDH | intergenic | rs10490361 | 0.46 | 0.44 | (0.23-0.84) | 1.35E-02 |
| CCL8 | intergenic | rs3138034 | 0.07 | 3.59 | (1.29-9.96) | 1.42E-02 |
| SOCS2 | intronic | rs3782415 | 0.48 | 2.38 | (1.18-4.83) | 1.60E-02 |
| DEF6 | intronic | rs6938946 | 0.34 | 2.26 | (1.16-4.39) | 1.68E-02 |
| ABHD16A | intronic | rs2295663 | 0.10 | 2.55 | (1.16-5.59) | 1.93E-02 |
| LBP | intronic | rs12624843 | 0.30 | 0.33 | (0.13-0.84) | 2.03E-02 |
| IL-18 | intergenic | rs243908 | 0.33 | 3.59 | (1.22-10.61) | 2.05E-02 |
| IL-10RB | UTR3 | rs1058867 | 0.32 | 2.62 | (1.14-6.04) | 2.33E-02 |
| IL-6R | intergenic | rs11265608 | 0.04 | 4.15 | (1.21-14.21) | 2.36E-02 |
| IRAK4 | intronic | rs4251460 | 0.11 | 2.78 | (1.15-6.73) | 2.38E-02 |
| TRAF5 | intronic | rs6684874 | 0.29 | 0.29 | (0.10-0.85) | 2.46E-02 |
| MIF | ncRNA_intronic | rs17004044 | 0.17 | 0.23 | (0.06-0.83) | 2.48E-02 |
| XDH | intronic | rs1429372 | 0.38 | 0.43 | (0.20-0.91) | 2.70E-02 |
| LMAN1 | intronic | rs12953981 | 0.41 | 0.41 | (0.19-0.91) | 2.74E-02 |
| ALOXE3 | intronic | rs3027208 | 0.43 | 0.44 | (0.21-0.91) | 2.76E-02 |
| can | intergenic | rs4795904 | 0.08 | 3.11 | (1.13-8.56) | 2.81E-02 |
| IL-12B | intergenic | rs4921468 | 0.22 | 2.54 | (1.10-5.87) | 2.85E-02 |
| IL-4R | UTR3 | rs8832 | 0.42 | 0.39 | (0.17-0.91) | 2.85E-02 |
| IL-12A | intergenic | rs747825 | 0.15 | 0.10 | (0.01-0.79) | 2.90E-02 |
| SCNN1A | intronic | rs3759324 | 0.36 | 2.10 | (1.07-4.14) | 3.03E-02 |
| ITGB2 | intronic | rs1474552 | 0.23 | 0.26 | (0.08-0.88) | 3.06E-02 |
| C6 | intronic | rs13168926 | 0.40 | 0.40 | (0.18-0.92) | 3.08E-02 |
| FGF2 | intergenic | rs308447 | 0.08 | 2.89 | (1.09-7.65) | 3.25E-02 |
| IL-10 | intronic | rs3021094 | 0.42 | 0.40 | (0.17-0.93) | 3.26E-02 |
| SELE | intergenic | rs4656699 | 0.20 | 0.31 | (0.11-0.92) | 3.41E-02 |
| STK19 | intronic | rs389883 | 0.26 | 1.96 | (1.05-3.67) | 3.46E-02 |
| STAT4 | intronic | rs1031509 | 0.31 | 0.43 | (0.19-0.94) | 3.53E-02 |
| NCF4 | intronic | rs2075938 | 0.39 | 2.17 | (1.05-4.51) | 3.66E-02 |
| SLC2A11 | intergenic | rs1984309 | 0.39 | 0.44 | (0.20-0.95) | 3.68E-02 |
| BPI | intronic | rs2275954 | 0.40 | 2.19 | (1.05-4.59) | 3.70E-02 |
| TNFRSF1A | intronic | rs4149577 | 0.41 | 0.37 | (0.14-0.94) | 3.70E-02 |
Table 2. Cont.

| Gene | Location | SNP | MAF | HR a | (95% CI) | P |
|---|---|---|---|---|---|---|
| KLK15 | upstream | rs3745523 | 0.29 | 2.07 | (1.04-4.12) | 3.81E-02 |
| BCL2 | intronic | rs12458289 | 0.28 | 2.27 | (1.04-4.96) | 4.00E-02 |
| MBL2 | intergenic | rs11003134 | 0.20 | 8.09 | (1.08-60.37) | 4.16E-02 |
| BCL10 | intergenic | rs6693365 | 0.30 | 2.36 | (1.03-5.39) | 4.18E-02 |
| SELE | intronic | rs3917412 | 0.28 | 2.13 | (1.03-4.40) | 4.21E-02 |
| CD180 | intergenic | rs6890674 | 0.15 | 2.27 | (1.03-5.02) | 4.29E-02 |
| MAL | intronic | rs3113002 | 0.35 | 0.45 | (0.21-0.98) | 4.30E-02 |
| AICDA | UTR3 | rs11046349 | 0.12 | 2.81 | (1.03-7.69) | 4.44E-02 |
| C1QA | intronic | rs2935542 | 0.14 | 2.29 | (1.02-5.13) | 4.49E-02 |
| IRF4 | intergenic | rs11242867 | 0.29 | 2.16 | (1.02-4.61) | 4.50E-02 |
| IL-8 | intergenic | rs4694178 | 0.40 | 0.48 | (0.23-0.99) | 4.61E-02 |
| MASP1 | intronic | rs3105782 | 0.15 | 2.30 | (1.01-5.24) | 4.70E-02 |
| MUC2 | intergenic | rs4077757 | 0.03 | 3.88 | (1.01-14.90) | 4.80E-02 |

a Multivariate Cox proportional hazard model adjusted for age, estrogen receptor status, progesterone receptor status and TNM stage.
doi:10.1371/journal.pone.0103593.t002



assessed, number of polymorphisms assessed, number of patients
and events (including recurrence and death), follow-up period,
type of outcome, and covariates. Associations between
polymorphisms and the outcome of each cancer were recorded as
HRs with 95% CIs and adjustments. Because different
nomenclatures and names for polymorphisms were used in the
studies, all polymorphisms were named by RefSNP (rs) numbers.
We followed the Preferred Reporting Items for Systematic
Reviews and Meta-Analyses (PRISMA) statement and checklist as
a methodological template for this review (Table S1).

Results 

Table 1 shows the characteristics of the 107 patients,
including the 20 patients who had events. Among the 107 cases,
BMI, PR status, and TNM stage were significantly associated with
DFS (P<0.05, log-rank test), while there were no significant
differences by age, family history of breast cancer, educational
level, menopausal status, smoking status, alcohol consumption,
or ER status.

The associations of immunity-related genetic factors with the
DFS of breast cancer are presented in Table 2. Among 1,971
SNPs, 80 SNPs were significantly associated with the DFS of
breast cancer. Sixty-two SNPs remained after excluding those in
high LD (r² > 0.4), and 3 SNPs were still significant at an FDR
p-value less than 0.05. These SNPs were rs1952438 in the SOCS4
gene (HR = 11.99, 95% CI = 3.62-39.72, P = 4.84E-05),
rs2289278 in the TSLP gene (HR = 4.25, 95% CI = 2.10-8.62,
P = 5.99E-05), and rs2074724 in the HGF gene (HR = 4.63,
95% CI = 2.18-9.87, P = 7.04E-05).
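
As a consistency check on figures like these, the standard error of the log hazard ratio and an approximate Wald p-value can be recovered from a reported HR and its 95% CI. The sketch below assumes the interval is a symmetric Wald interval on the log scale, which is the usual form for Cox model output but is not stated in the text; the function name is illustrative.

```python
import math


def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover log(HR), its standard error, z and a two-sided p-value."""
    log_hr = math.log(hr)
    # A 95% Wald interval spans 2 * z_crit standard errors on the log scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return log_hr, se, z, p


# rs1952438 in SOCS4 as reported: HR = 11.99, 95% CI = 3.62-39.72.
log_hr, se, z, p = wald_from_hr(11.99, 3.62, 39.72)
print(f"se = {se:.3f}, z = {z:.2f}, p = {p:.2e}")
```

The recovered p-value is close to the reported 4.84E-05; small differences are expected because the published HR and CI are rounded.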

Figure 1 presents the Kaplan-Meier survival curves and
estimated HRs of breast cancer in groups defined by tertiles of
the polygenic risk score of the 107 patients with all 62 SNPs.
The HR increased significantly as the score increased (p for
trend = 0.01). The HR of women in the 3rd tertile was 6.78
(95% CI = 1.48-31.06) compared to patients in the 1st tertile of
the polygenic risk score. Table 3 shows the predictive accuracy
and validation results of the polygenic risk score model.
Harrell's C index was 0.813 for all patients, and the summarized
Harrell's C index from cross-validation was 0.924.
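
The summarized cross-validation value can be reproduced from the per-fold figures reported in Table 3 with an inverse-variance fixed-effect summary. The sketch below is an illustration of that calculation, assuming the standard inverse-variance weighting and a normal-approximation 95% interval; the function name is hypothetical and the inputs are the rounded values from the table.

```python
import math


def fixed_effect_summary(estimates, standard_errors, z_crit=1.959964):
    """Inverse-variance weighted mean, its standard error and 95% CI."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    ci = (pooled - z_crit * pooled_se, pooled + z_crit * pooled_se)
    return pooled, pooled_se, ci


# Harrell's C index and standard error for CV sets 1 to 4 (Table 3).
c_index = [0.885, 0.910, 0.940, 0.909]
se = [0.09, 0.06, 0.03, 0.04]
pooled, pooled_se, ci = fixed_effect_summary(c_index, se)
print(round(pooled, 3), round(pooled_se, 2), [round(x, 2) for x in ci])
```

Folds with smaller standard errors dominate the summary, which is why the pooled value sits close to the estimate from CV set 3.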

In the GSEA-SNP analysis, 18 pathways with 62 SNPs in 56
immunity-related genes were significantly associated with the
DFS of breast cancer at a p-value less than 0.1 (Table 4). These
included the set 'Myc targets 1', targets of c-Myc identified by
ChIP on chip in cultured cell lines, focusing on
E-box-containing genes, high-affinity bound subset (including
BCL2 and NBN, P = 0.04); mitochondrial genes, based on
literature and sequence annotation resources and converted to
Affymetrix HG-U133A probe sets (including BCL2 and NBN,
P = 0.04); genes down-regulated in T24 (bladder cancer) cells in
response to the photodynamic



Table 3. Harrell's C index for polygenic risk score estimated by 4-fold cross-validation.

| Group | No. of SNPs in CV set | Harrell's C index | Standard error | (95% CI) |
|---|---|---|---|---|
| All | | 0.813 | 0.48 | (0.72-0.91) |
| CV set1 | 25 | 0.885 | 0.09 | (0.70-1.07) |
| CV set2 | 40 | 0.910 | 0.06 | (0.78-1.04) |
| CV set3 | 32 | 0.940 | 0.03 | (0.88-1.00) |
| CV set4 | 36 | 0.909 | 0.04 | (0.82-1.00) |
| Summary a | | 0.924 | 0.02 | (0.88-0.97) |

a The summary of Harrell's C index for 4 test sets calculated by fixed-effect meta-analysis.
doi:10.1371/journal.pone.0103593.t003
therapy (PDT) stress (including BCL2 and CCL2, P = 0.04); and
genes transiently induced only by the second pulse of EGF in
184A1 cells (mammary epithelium) (including IRF3, TRAF5, KLK15
and IL5R, P = 0.02).

Table 5 shows the 30 studies identified by the systematic review
that performed survival analyses estimating the effects of
immune-related genetic factors on various cancers. In these
studies, 88 SNPs in 58 immunity genes were significantly
associated with the prognosis of cancer patients (Table 6).
Twenty-nine genes overlapped between our study and previous
studies, but no SNPs overlapped. Among them, IL-6R, IL-8,
IL-10RB, IL-12A, and IL-12B were significantly associated with
the prognosis of cancer, consistent with our findings.

Discussion 

In this study, we found that rs1952438 in the suppressor of
cytokine signaling 4 (SOCS4) gene, rs2289278 in the thymic
stromal lymphopoietin (TSLP) gene, and rs2074724 in the
hepatocyte growth factor (HGF) gene were strongly associated
with a poor prognosis of breast cancer. Moreover, the polygenic
risk score model with genetic variations of immunity-related
genes showed that the hazard for DFS events increased
significantly as high-risk alleles accumulated. In the GSEA-SNP
analysis, 18 pathways were significantly associated with breast
cancer prognosis.

rs1952438 is located in an intron of the SOCS4 gene. SOCS
family proteins are rapidly induced by activated STATs and
negatively regulate the JAK/STAT pathway through a classical
feedback loop [22]. Furthermore, other signaling molecules
related to carcinogenesis, such as FAK, IRS, p65, and GR, are
regulated by SOCS proteins [23-27]. In addition, previous work
reported that patients with higher SOCS4 expression were more
likely to remain disease free than those who developed
recurrence [28]. In view of previous studies explaining the
functional importance of SOCS4 and the results of the present
study, rs1952438 might be associated with poorer prognosis of
breast cancer through decreased expression of SOCS4.

rs2289278 is found in intron 2 of the long form of TSLP and
in the 5' untranslated region of the short form of TSLP [29].
TSLP is a member of the IL-2 cytokine family and a distant
paralog of IL-7. TSLP may have an important role in tumor
progression by activating CD4+ T cells, inducing the expression
of OX40L in dendritic cells (DCs), and producing Th2-type
cytokines and B-cell growth factor [30]. A recent study has shown
that breast cancer cells have high expression levels of TSLP,
indicating that TSLP may be critical in the development of
breast cancer [31]. A high expression level of TSLP in cancer
increases the Th2 level [30]. Furthermore, Th2 cytokines promote
disease progression through the increased survival of cancer
cells, M2 macrophage differentiation, and fibrosis [31,32].
Thus, TSLP may be an important factor in breast tumor
progression and patient prognosis.

rs2074724 is located in an intron of HGF. HGF is known to
activate tumor angiogenesis as well as cell-cell interactions,
matrix adhesion, migration, and invasion [33]. Moreover, breast
cancer patients with a high HGF concentration had a
significantly poorer prognosis than those with a low HGF
concentration [34]. Therefore, HGF level was found to be the
most important independent factor in predicting the prognosis of
breast cancer.

In the GSEA-SNP analysis, there were 18 significant pathways;
among these pathways, the gene set from Kyng et al. [35], which
included rs1952438 in the SOCS4 gene and rs2074724, and
Table 6. Genes that have significant SNPs of each study in the review of previous studies.

| Gene | SNP | Primary endpoint a | HR | (95% CI) | P | Type of cancer b | Ref |
|---|---|---|---|---|---|---|---|
| C7 | rs324058 | EFS | 1.66 | (0.87-3.17) | 0.04 | Lymphoma | [69] |
| C9 | rs1421094 | EFS | 0.54 | (0.32-0.90) | 0.02 | Lymphoma | [69] |
| CCR5 | rs1800940 | OS | 0.73 | (0.53-1.00) | | Lymphoma | [68] |
| CD46 | rs2466571 | EFS | 1.49 | (0.86-2.61) | 0.05 | Lymphoma | [69] |
| CD55 | rs2564978 | EFS | 0.52 | (0.30-0.88) | <0.01 | Lymphoma | [69] |
| CD80 | rs13071247 | OS | 1.73 | (1.26-2.39) | <0.01 | Ovarian cancer | [72] |
| | rs7804190 | OS | 1.14 | (1.06-1.23) | <0.01 | Ovarian cancer | [72] |
| CFH | rs3766404 | EFS | 2.25 | (1.31-3.87) | <0.01 | Lymphoma | [69] |
| | rs1329423 | EFS | 0.49 | (0.29-0.38) | <0.01 | Lymphoma | [69] |
| CFHR1 | rs436719 | EFS | 0.57 | (0.34-0.96) | 0.03 | Lymphoma | [69] |
| CFHR5 | rs6694672 | EFS | 2.63 | (1.41-4.92) | <0.01 | Lymphoma | [69] |
| CLU | rs3087554 | EFS | 0.46 | (0.21-1.00) | 0.05 | Lymphoma | [69] |
| COX-2 | rs689466 | OS | 0.58 | (0.39-0.86) | 0.01 | NSCLC | [58] |
| ERCC2 | rs238406 | OS | 1.64 | (1.08-2.50) | 0.02 | Esophageal cancer | [75] |
| | rs238406 | PFS | 1.76 | (1.17-2.66) | 0.01 | Esophageal cancer | [75] |
| | rs1799793 | BCSS | 1.90 | (1.06-3.26) | 0.04 | Breast cancer | [53] |
| | rs1799793 | EFS | 0.23 | (0.05-0.99) | 0.01 | Osteosarcoma | [74] |
| FasL | rs763110 | OS | 1.46 | (1.13-1.87) | <0.01 | NSCLC | [60] |
| | rs763110 | RFS | 1.71 | (1.33-2.21) | <0.01 | NSCLC | [60] |
| GATA3 | rs10905278 | OS | 1.82 | (1.31-2.53) | <0.01 | Pancreatic cancer | [73] |
| IFNAR1 | rs2257167 | EFS | 0.74 | (0.55-1.00) | 0.05 | NSCLC | [63] |
| IFNGR1 | rs1327474 | OS | 0.69 | (0.50-0.94) | 0.02 | Colorectal cancer | [56] |
| | rs9376267 | OS | 1.37 | (1.09-1.73) | 0.01 | Colorectal cancer | [56] |
| IFNGR2 | rs2834211 | OS | 1.32 | (1.01-1.72) | 0.04 | Colorectal cancer | [56] |
| | rs2834213 | OS | 2.04 | (1.16-3.57) | 0.01 | Colorectal cancer | [56] |
| IFNW1 | rs10964859 | OS | 1.80 | (1.02-3.16) | 0.04 | Melanoma | [71] |
| IL-10RB | rs8128184 | EFS | 1.59 | (1.11-2.29) | 0.01 | NSCLC | [63] |
| IL-12A | rs2243148 | EFS | 1.28 | (1.03-1.58) | 0.03 | NSCLC | [63] |
| IL-12B | rs3212227 | OS | 1.83 | (1.09-3.06) | <0.01 | Lymphoma | [42] |
| IL-13 | rs1295683 | EFS | 1.39 | (1.03-1.87) | 0.03 | NSCLC | [63] |
| IL-1A | rs3783546 | OS | 2.07 | (1.28-3.36) | 0.02 | Colorectal cancer | [57] |
| | rs1800587 | OS | 1.90 | (1.26-2.87) | <0.01 | Lymphoma | [70] |
| IL-1B | rs1143623 | OS | 1.37 | (1.09-1.72) | 0.01 | Colorectal cancer | [57] |
| | rs1143627 | OS | 0.50 | (0.30-1.00) | 0.04 | Myeloma | [77] |
| IL-1RN | rs454078 | OS | 1.93 | (1.11-3.34) | 0.03 | Lymphoma | [42] |
| IL-2 | rs2069763 | OS | 1.43 | (1.15-3.82) | | Breast cancer | [13] |
| | rs2069762 | OS | 1.80 | (1.06-3.05) | 0.01 | Lymphoma | [42] |
| IL-21 | rs12508721 | OS | 0.45 | (0.30-0.67) | <0.01 | Breast cancer | [12] |
| IL-23R | rs6682925 | OS | 1.34 | (1.05-1.70) | | NSCLC | [59] |
| IL-3 | rs181781 | OS | 2.47 | (1.11-5.53) | 0.03 | Colorectal cancer | [57] |
| IL-5 | rs2069807 | OS | 4.56 | (1.98-10.5) | <0.01 | Lymphoma | [70] |
| | rs2069818 | OS | 5.58 | (1.66-18.6) | 0.01 | Lymphoma | [42] |
| IL-5R | rs11713419 | OS | 6.60 | (2.42-18.02) | | NSCLC | [59] |
| IL-6 | 1)1 | OS | 0.42 | (0.23-0.77) | | Lymphoma | Looj |
| | rs1800797 | DFS | 1.60 | (1.09-2.35) | 0.02 | Breast cancer | [14] |
| IL-6R | rs4240872 | EFS | 0.75 | (0.59-0.95) | 0.02 | NSCLC | [63] |
| IL-8 | rs4073 | OS | 2.14 | (1.26-3.63) | | Lymphoma | [42] |
| | rs2227307 | OS | 1.90 | (1.12-3.22) | | Lymphoma | [42] |
| | rs2227306 | OS | 1.96 | (1.07-3.28) | | Lymphoma | [42] |
| | rs12506479 | EFS | 1.32 | (1.08-1.62) | 0.01 | NSCLC | [63] |
Table 6. Cont.

| Gene | SNP | Primary endpoint a | HR | (95% CI) | P | Type of cancer b | Ref |
|---|---|---|---|---|---|---|---|
| IL-8RB | rs1126579 | OS | 1.61 | (1.05-2.46) | 0.02 | Colorectal cancer | [57] |
| | rs1126580 | OS | 2.11 | (1.28-3.50) | <0.01 | Lymphoma | [70] |
| IRF2 | rs12504466 | OS | 1.51 | (1.14-1.99) | <0.01 | Colorectal cancer | [56] |
| | rs13116389 | OS | 1.38 | (1.09-1.75) | 0.01 | Colorectal cancer | [56] |
| | rs2797507 | OS | 0.77 | (0.61-0.98) | 0.03 | Colorectal cancer | [56] |
| | rs3775582 | OS | 0.67 | (0.50-0.89) | 0.01 | Colorectal cancer | [56] |
| | rs7655800 | OS | 1.33 | (1.04-1.70) | 0.02 | Colorectal cancer | [56] |
| | rs793801 | OS | 1.39 | (1.01-1.91) | 0.04 | Colorectal cancer | [56] |
| | rs1425551 | OS | 1.50 | (1.03-2.18) | 0.04 | Colorectal cancer | [56] |
| | rs3756094 | OS | 0.36 | (0.20-0.66) | <0.01 | Colorectal cancer | [56] |
| | rs3822118 | OS | 1.47 | (1.08-2.01) | 0.02 | Colorectal cancer | [56] |
| | rs807684 | OS | 0.30 | (0.14-0.66) | <0.01 | Colorectal cancer | [56] |
| | rs1044873 | OS | 1.32 | (1.04-1.68) | 0.03 | Colorectal cancer | [56] |
| | rs305083 | OS | 1.31 | (1.04-1.65) | 0.02 | Colorectal cancer | [56] |
| IRF6 | rs2013196 | OS | 1.29 | (1.02-1.63) | 0.03 | Colorectal cancer | [56] |
| LRRC32 | rs3781699 | OS | 2.32 | (1.45-3.71) | <0.01 | Ovarian cancer | [72] |
| | rs7944357 | OS | 2.04 | (1.34-3.10) | <0.01 | Ovarian cancer | [72] |
| MBL2 | rs7096206 | OS | 0.55 | (0.42-0.73) | <0.01 | NSCLC | [65] |
| MET | rs11762213 | RFS | 1.86 | (1.17-2.95) | 0.01 | Renal cell cancer | [67] |
| NFKB | rs7157810 | OS | 1.43 | (1.16-1.75) | <0.01 | Pancreatic cancer | [73] |
| NOD2 | rs9302752 | OS | 3.19 | (2.04-4.34) | WM | Bladder cancer | [66] |
| NOS3 | rs1799983 | OS | 1.39 | (1.14-1.70) | <0.01 | Pancreatic cancer | [73] |
| REG4 | rs2994809 | DFS | 2.00 | (1.18-3.39) | 0.01 | Colorectal cancer | [54] |
| | rs2994811 | OS | 1.35 | (1.02-1.78) | 0.03 | Colorectal cancer | [54] |
| RGS1 | rs10921202 | OS | 2.93 | (1.77-4.84) | <0.01 | Ovarian cancer | [72] |
| RIPK1 | rs2326173 | OS | 1.44 | (1.20-1.74) | <0.01 | Pancreatic cancer | [73] |


S0CS3 


rs8064821 


OS 


0.65 


(0.49-0.87) 


<0.01 


Pancreatic cancer 


[73] 


STAT1 


rsl 2693591 


OS 


0.68 


(0.55-0.86) 


<0.01 


Pancreatic cancer 


[73] 


TGF-fSI 


rs10469 


OS 


1.46 


(1.01-2.11) 


0.04 


NSCLC 


[61] 




rsl 982073 


DMFS 


1.59 


(1.01-2.50) 


0.05 


NSCLC 


[61] 




rsl 982073 


DFS 


3.23 


(1.19-8.77) 


0.02 


HNSCC 


[76] 




rsl 800469 


OS 


0.46 


(0.25-0.87) 


0.02 


NSCLC 


[62] 


TGFBR1 


rs10512263 


EFS 


0.59 


(0.37-0.94) 


0.03 


NSCLC 


[63] 




rs868 


EFS 


1.28 


(1.01-1.61) 


0.04 


NSCLC 


[63] 


TGFBR2 


rs2043136 


EFS 


0.74 


(0.58-0.95) 


0.02 


NSCLC 


[63] 


TLR1 


rs5743551 


OS 


0.78 


(0.62-0.97) 


- 


NSCLC 


[59] 


TLR 10 


rs41 29009 


OS 


0.49 


(0.18-0.80) 




Bladder cancer 


[66] 


TLR3 


rs3775291 


OS 


1.93 


(1.14-3.28) 




Colorectal cancer 


[55] 




rs3775291 


OS 


1.37 


(1.09-1.73) 




NSCLC 


[59] 


TLR4 


rsl 1536889 


OS 


1.38 


(1.09-3.12) 


0.02 


Breast cancer 


[11] 


TNFRSF10B 


rsl 1785599 


EFS 


1.41 


(1.16-1.70) 


<0.01 


NSCLC 


[63] 


TNFRSF1B 


rs1061622 


OS 


0.38 


(0.15-0.94) 


0.04 


NSCLC 


[64] 


TNFRSF4 


rs3753348 


OS 


3.41 


(1.65-7.05) 


<0.01 


Ovarian cancer 


[72] 



a EFS, event free survival; OS, overall survival; RFS, relapse free survival; DFS, disease free survival; DMFS, distant metastasis-free survival; BCSS, breast cancer specific 
survival. 

b DLBCL, diffuse large B-cell lymphoma; NSCLC, non-small cell lung cancer; FL, follicular lymphoma; HNSCC, head and neck squamous cell carcinoma. 
doi:10.1371/journal.pone.0103593.t006
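
As a reading aid for the HR, 95% CI, and P columns of Table 6, the
sketch below recovers an approximate two-sided Wald P value from a
reported hazard ratio and its confidence interval. It assumes that
the interval is a symmetric Wald interval on the log-hazard scale,
which the cited studies may not all have used. The function name
and the choice of example row are illustrative and are not part of
the original analyses.

```python
import math


def wald_p_from_hr(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald P value from an HR and its 95% CI.

    Assumption: the CI was built as exp(log(HR) +/- z_crit * SE).
    """
    # Standard error of log(HR), recovered from the CI width
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))


# MBL2 rs7096206 row of Table 6: HR = 0.55, 95% CI = 0.42-0.73
p = wald_p_from_hr(0.55, 0.42, 0.73)
print(f"approximate P = {p:.1e}")
```

For this row the approximation falls below 0.01, in line with the
reported "<0.01". A large gap between a recomputed and a reported P
value would more likely point to a different test or an asymmetric
interval than to an error in the source study.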



rs5745752 in HGF gene is described that environmental stress such
as 4-nitroquinoline-1-oxide (4NQO) elicited DNA damage-specific
gene expression changes of up to 10. In short, it can be expected
that the SNPs included in this pathway up-regulate breast cancer
progression and result in poor prognosis by influencing the
environmental response, although there is no precise evidence for
this assumption.

The 'Myc targets1' gene set from Ben-Porath et al. [36], which
included rs12458289 and rs9989529 in the BCL2 gene and rs2142097 in
the NBN gene, was the most significant gene set. Ben-Porath et al.
describe that targets of Nanog, Oct4, Sox2 and c-Myc are more
frequently associated with poorly differentiated tumors than with
well-differentiated tumors. c-Myc is well known to directly
regulate the expression of the NBN gene, which is involved in DNA
double-strand break repair, and this can result in chromosomal
instability and cellular proliferation defects, leading to more
aggressive and metastatic tumors [37,38]. BCL2 and c-Myc are known
to form a negative feedback loop in a breast cancer cell line [39].
Taking the findings of Ben-Porath et al. and the results of the
present study into account, it can be deduced that rs12458289,
rs9989529, and rs2142097 might be associated with the prognosis of
breast cancer through interaction with the c-MYC gene.

To support the indirect functional effects of our results, we
searched for potentially functional SNPs in SOCS4, HGF, TSLP, and
the genes included in GSEA-SNP using the UCSC database [40], and
checked the linkage disequilibrium (LD) between these potentially
functional SNPs and our findings. Table S2 shows the functional
SNPs examined in this study and the functional SNPs in LD with
them; these are generally located in the 5'UTR or 3'UTR and affect
histone modification, DNA methylation, and the binding affinity of
several transcription factors. For example, the transcriptional
activity of IL-8 is influenced by rs4073, which is located in the
promoter region of IL-8 [41], and the variant increased the risk of
mortality in follicular lymphocytic leukemia by increasing
production of IL-8 [42]. It is therefore possible that these SNPs
influence breast cancer prognosis by regulating epigenetic and
transcriptional pathways.
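
The LD check described above can be illustrated with a minimal
sketch that computes D, D' and r^2 for two biallelic SNPs from
haplotype and allele frequencies. The frequencies below are
invented for illustration and are not taken from this study or
from the UCSC database; the function name is likewise hypothetical.

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic, polymorphic SNPs.

    p_ab: frequency of the haplotype carrying allele A (SNP 1)
          together with allele B (SNP 2)
    p_a, p_b: frequencies of allele A and allele B
    """
    # Deviation of the observed haplotype frequency from independence
    d = p_ab - p_a * p_b
    # Largest |D| attainable given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # Squared correlation between the two loci
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical frequencies, chosen only to illustrate the calculation
d, d_prime, r2 = ld_stats(p_ab=0.25, p_a=0.30, p_b=0.40)
print(f"D = {d:.3f}, D' = {d_prime:.3f}, r^2 = {r2:.3f}")
```

A genotyped tag SNP with a high r^2 to a functional SNP can serve
as a proxy for it, which is the rationale for comparing the two
sets of SNPs.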

Several previous reports have evaluated the associations between
immunity gene polymorphisms and breast cancer prognosis [11-14].
They suggested that variants of the ERCC2, TLR4, IL-2, IL-6, and
IL-21 genes were each associated with breast cancer prognosis.
However, those associations were not replicated in the present
study. In studies of other cancer types, the IL-6R, IL-8, IL-10RB,
IL-12A, and IL-12B genes were associated with cancer prognosis in
both our study and the previous ones. However, few SNPs were
consistently associated with cancer prognosis across our review of
the literature, which may result from differences in cancer type,
ethnicity, the prognostic factors included in the models, and
statistical power.